{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Copyright 2019 The Kubeflow Authors. All Rights Reserved.\n",
    "#\n",
    "# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
    "# you may not use this file except in compliance with the License.\n",
    "# You may obtain a copy of the License at\n",
    "#\n",
    "#      http://www.apache.org/licenses/LICENSE-2.0\n",
    "#\n",
    "# Unless required by applicable law or agreed to in writing, software\n",
    "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
    "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
    "# See the License for the specific language governing permissions and\n",
    "# limitations under the License.\n",
    "# =============================================================================="
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Deploying a Kubeflow Cluster on Google Cloud Platform (GCP)\n",
    "This notebook provides instructions for setting up a Kubeflow cluster on GCP using the command-line interface (CLI). For additional help, see the guide to [deploying Kubeflow using the CLI](https://www.kubeflow.org/docs/gke/deploy/deploy-cli/).\n",
    "\n",
    "There are two possible alternatives:\n",
    "- The first alternative is to deploy Kubeflow cluster using the [Kubeflow deployment web app](https://deploy.kubeflow.cloud/), and the instruction can be found [here](https://www.kubeflow.org/docs/gke/deploy/deploy-ui/).\n",
    "- Another alternative is to use recently launched [AI Platform Pipeline](https://cloud.google.com/ai-platform/pipelines/docs/introduction). But, it is important to note that the AI Platform Pipeline is a standalone [Kubeflow Pipeline](https://www.kubeflow.org/docs/pipelines/overview/pipelines-overview/) deployment, where a lot of the components in full Kubeflow deployment won't be pre-installed. The instruction can be found [here](https://cloud.google.com/ai-platform/pipelines/docs/setting-up).\n",
    "\n",
    "The CLI deployment gives you more control over the deployment process and configuration than you get if you use the deployment UI."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Prerequisites\n",
    "- You have a [GCP project setup](https://www.kubeflow.org/docs/gke/deploy/project-setup/) for your Kubeflow Deployment with you having the [owner role](https://cloud.google.com/iam/docs/understanding-roles#primitive_role_definitions) for the project and with the following APIs enabled:\n",
    "    - [Compute Engine API](https://pantheon.corp.google.com/apis/library/compute.googleapis.com)\n",
    "    - [Kubernetes Engine API](https://pantheon.corp.google.com/apis/library/container.googleapis.com)\n",
    "    - [Identity and Access Management(IAM) API](https://pantheon.corp.google.com/apis/library/iam.googleapis.com)\n",
    "    - [Deployment Manager API](https://pantheon.corp.google.com/apis/library/deploymentmanager.googleapis.com)\n",
    "    - [Cloud Resource Manager API](https://pantheon.corp.google.com/apis/library/cloudresourcemanager.googleapis.com)\n",
    "    - [AI Platform Training & Prediction API](https://pantheon.corp.google.com/apis/library/ml.googleapis.com)\n",
    "- You have set up [OAuth for Cloud IAP](https://www.kubeflow.org/docs/gke/deploy/oauth-setup/)\n",
    "- You have installed and setup [kubectl](https://kubernetes.io/docs/tasks/tools/install-kubectl/)\n",
    "- You have installed [gcloud-sdk](https://cloud.google.com/sdk/)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Running Environment\n",
    "\n",
    "**This notebook helps in creating the Kubeflow cluster on GCP. You must run this notebook in an environment with Cloud SDK installed, such as Cloud Shell. Learn more about [installing Cloud SDK](https://cloud.google.com/sdk/docs/).**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Setting up a Kubeflow cluster\n",
    "1. Download kfctl\n",
    "2. Setup environment variables\n",
    "3. Create dedicated service account for deployment\n",
    "4. Deploy Kubefow\n",
    "5. Install Kubeflow Pipelines SDK\n",
    "6. Sanity check"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Create a working directory\n",
    "Create a new working directory in your current directory. The default name is **kubeflow**, but you can change the name."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "work_directory_name = 'kubeflow'\n",
    "\n",
    "! mkdir -p $work_directory_name\n",
    "\n",
    "%cd $work_directory_name"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Download kfctl\n",
    "Download kfctl to your working directory. The default version used is v0.7.0, but you can find the latest release [here](https://github.com/kubeflow/kubeflow/releases)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "## Download kfctl v0.7.0\n",
    "! curl -LO https://github.com/kubeflow/kubeflow/releases/download/v0.7.0/kfctl_v0.7.0_linux.tar.gz\n",
    "    \n",
    "## Unpack the tar ball\n",
    "! tar -xvf kfctl_v0.7.0_linux.tar.gz"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**If you are using AI Platform Notebooks**, your environment is already authenticated. Skip the following cell."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "## Create user credentials\n",
    "! gcloud auth application-default login"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Set up environment variables\n",
    "Set up environment variables to use while installing Kubeflow. Replace variable placeholders (for example, `<VARIABLE NAME>`) with the correct values for your environment."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Set your GCP project ID and the zone where you want to create the Kubeflow deployment\n",
    "%env PROJECT=<ADD GCP PROJECT HERE>\n",
    "%env ZONE=<ADD GCP ZONE TO LAUNCH KUBEFLOW CLUSTER HERE>\n",
    "\n",
    "# google cloud storage bucket\n",
    "%env GCP_BUCKET=gs://<ADD STORAGE LOCATION HERE>\n",
    "\n",
    "# Use the following kfctl configuration file for authentication with \n",
    "# Cloud IAP (recommended):\n",
    "uri = \"https://raw.githubusercontent.com/kubeflow/manifests/v0.7-branch/kfdef/kfctl_gcp_iap.0.7.0.yaml\"\n",
    "uri = uri.strip()\n",
    "%env CONFIG_URI=$uri\n",
    "\n",
    "# For using Cloud IAP for authentication, create environment variables\n",
    "# from the OAuth client ID and secret that you obtained earlier:\n",
    "%env CLIENT_ID=<ADD OAuth CLIENT ID HERE>\n",
    "%env CLIENT_SECRET=<ADD OAuth CLIENT SECRET HERE>\n",
    "\n",
    "# Set KF_NAME to the name of your Kubeflow deployment. You also use this\n",
    "# value as directory name when creating your configuration directory. \n",
    "# For example, your deployment name can be 'my-kubeflow' or 'kf-test'.\n",
    "%env KF_NAME=<ADD KUBEFLOW DEPLOYMENT NAME HERE>\n",
    "\n",
    "# Set up name of the service account that should be created and used\n",
    "# while creating the Kubeflow cluster\n",
    "%env SA_NAME=<ADD SERVICE ACCOUNT NAME TO BE CREATED HERE>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Configure gcloud and add kfctl to your path."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "! gcloud config set project ${PROJECT}\n",
    "\n",
    "! gcloud config set compute/zone ${ZONE}\n",
    "\n",
    "\n",
    "# Set the path to the base directory where you want to store one or more \n",
    "# Kubeflow deployments. For example, /opt/.\n",
    "# Here we use the current working directory as the base directory\n",
    "# Then set the Kubeflow application directory for this deployment.\n",
    "\n",
    "import os\n",
    "base = os.getcwd()\n",
    "%env BASE_DIR=$base\n",
    "\n",
    "kf_dir = os.getenv('BASE_DIR') + \"/\" + os.getenv('KF_NAME')\n",
    "%env KF_DIR=$kf_dir\n",
    "\n",
    "# The following command is optional. It adds the kfctl binary to your path.\n",
    "# If you don't add kfctl to your path, you must use the full path\n",
    "# each time you run kfctl. In this example, the kfctl file is present in\n",
    "# the current directory\n",
    "new_path = os.getenv('PATH') + \":\" + os.getenv('BASE_DIR')\n",
    "%env PATH=$new_path"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Create service account\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "! gcloud iam service-accounts create ${SA_NAME}\n",
    "! gcloud projects add-iam-policy-binding ${PROJECT} \\\n",
    "  --member serviceAccount:${SA_NAME}@${PROJECT}.iam.gserviceaccount.com \\\n",
    "  --role 'roles/owner'\n",
    "! gcloud iam service-accounts keys create key.json \\\n",
    "  --iam-account ${SA_NAME}@${PROJECT}.iam.gserviceaccount.com"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Set GOOGLE_APPLICATION_CREDENTIALS"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "key_path = os.getenv('BASE_DIR') + \"/\" + 'key.json'\n",
    "%env GOOGLE_APPLICATION_CREDENTIALS=$key_path"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Setup and deploy Kubeflow"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "! mkdir -p ${KF_DIR}\n",
    "%cd $kf_dir\n",
    "! kfctl apply -V -f ${CONFIG_URI}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Install Kubeflow Pipelines SDK"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%capture\n",
    "\n",
    "# Install the SDK (Uncomment the code if the SDK is not installed before)\n",
    "! pip3 install 'kfp>=0.1.36' --quiet --user"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Sanity Check: Check the ingress created"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "! kubectl -n istio-system describe ingress"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Access the Kubeflow cluster at **`https://<KF_NAME>.endpoints.<gcp_project_id>.cloud.goog/`**\n",
    "\n",
    "Note that it may take up to 15-20 mins for the above url to be functional."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
